In mathematics, mirror descent is an iterative optimization algorithm for finding a local minimum of a differentiable function.
It generalizes algorithms such as gradient descent and multiplicative weights.
History
Mirror descent was originally proposed by Nemirovski and Yudin in 1983.
[Arkadi Nemirovsky and David Yudin. Problem Complexity and Method Efficiency in Optimization. John Wiley & Sons, 1983]
Motivation
In
gradient descent with the sequence of learning rates
applied to a differentiable function
, one starts with a guess
for a local minimum of
and considers the sequence
such that
This can be reformulated by noting that
In other words, minimizes the first-order approximation to at with added proximity term .
This squared Euclidean distance term is a particular example of a Bregman distance. Using other Bregman distances will yield other algorithms such as
/ref>
Formulation
We are given convex function
to optimize over a convex set
, and given some norm
on
.
We are also given differentiable convex function , -strongly convex with respect to the given norm. This is called the distance-generating function, and its gradient is known as the mirror map.
Starting from initial , in each iteration of Mirror Descent:
-
Map to the dual space:
-
Update in the dual space using a gradient step:
-
Map back to the primal space:
-
Project back to the feasible region : , where is the Bregman divergence.
Extensions
Mirror descent in the online optimization setting is known as Online Mirror Descent (OMD).
See also